Orbital AI data centers could work, but they might ruin Earth in the process

Engadget

A single collision could cause a cascading effect in orbit. Elon Musk's plan to launch millions of AI satellites could be disastrous for the planet. At the start of the month, Elon Musk announced that two of his companies -- SpaceX and xAI -- were merging, and would jointly launch a constellation of 1 million satellites to operate as orbital data centers. Musk's reputation might suggest otherwise, but according to experts, such a plan isn't a complete fantasy. However, if executed at the scale suggested, some of them believe it would have devastating effects on the environment and the sustainability of low Earth orbit.



A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems Yi Ma

Neural Information Processing Systems

To address this problem, existing methods partition the overall DPDP into fixed-size sub-problems by caching orders generated online and solve each sub-problem; some additionally use predicted future orders to optimize each sub-problem further. However, the solution quality and efficiency of these methods are unsatisfactory, especially when the problem scale is very large.


How Two Zoomers Created RentAHuman, the First Marketplace for Bots to Hire Humans

WIRED

WIRED spoke with the Zoomer founders of a platform where AI agents hire humans to do real-world tasks. Their pitch: People would love to have a clanker as their boss. For centuries, people have catastrophized about robots taking away jobs. On February 1, the paradigm shifted: bots are jobs. Now, 518,284 humans--and rapidly counting--are offering their labor to AI agents on a new online marketplace called RentAHuman. There are classifieds to count pigeons in Washington ($30/hour); deliver CBD gummies ($75/hour); play exhibition badminton ($100/hour); and anything else you could possibly imagine that a disembodied agent couldn't do.




Debiasing Conditional Stochastic Optimization Lie He

Neural Information Processing Systems

The sample-averaged gradient of the CSO objective is biased due to its nested structure, and therefore requires a high sample complexity for convergence. We introduce a general stochastic extrapolation technique that effectively reduces the bias.
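The bias caused by nesting can be seen on a toy problem. The sketch below is not the paper's algorithm: it is a generic Richardson-style extrapolation, with a quadratic outer function and Gaussian inner samples chosen purely as assumptions for illustration, and all function names are hypothetical. It compares the plug-in estimate f(sample mean) with a combination of full-sample and half-sample plug-ins that cancels the leading O(1/m) bias term.

```python
import random


def plain_estimate(samples, f):
    """Plug-in estimate f(sample mean); biased whenever f is nonlinear."""
    return f(sum(samples) / len(samples))


def extrapolated_estimate(samples, f):
    """Richardson-style combination (illustrative assumption, not the paper's method).

    Uses one plug-in on all samples and the average of two plug-ins on each
    half, weighted so that the O(1/m) bias terms cancel.
    """
    half = len(samples) // 2
    full = plain_estimate(samples, f)
    halves = 0.5 * (plain_estimate(samples[:half], f)
                    + plain_estimate(samples[half:], f))
    return 2.0 * full - halves


def estimate_bias(estimator, f, mu, sigma, n_samples, trials, seed):
    """Monte Carlo estimate of E[estimator] - f(mu) for Gaussian inner samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        total += estimator(samples, f)
    return total / trials - f(mu)


def square(u):
    """Toy nonlinear outer function."""
    return u * u


if __name__ == "__main__":
    args = dict(f=square, mu=1.0, sigma=1.0, n_samples=4, trials=20000, seed=0)
    print("plain bias:       ", estimate_bias(plain_estimate, **args))
    print("extrapolated bias:", estimate_bias(extrapolated_estimate, **args))
```

For this quadratic toy case the plug-in bias is sigma^2 / n_samples, while the extrapolated estimate reduces to the product of the two half-sample means and is therefore unbiased. For general smooth f, only the leading bias term is removed.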



1 Details about the observation formats. Figure 1: Example of the observation of WebShop. The observation of WebShop is simplified based on the text_rich

Neural Information Processing Systems

The observation of WikiHow is represented in exactly the same way as in Zhang et al. [2023].

Table 1: Patterns of WebShop pages

Pattern      Description
search       The page to search for an item
itemlisting  The page listing the search results
item         The information page of a specific item
others       The item description page, item feature page, and review page

The similarity lookup table is defined in Table 2.

Table 2: Lookup table of the page similarity of WebShop

             search  itemlisting  item  others
search       1       0            0     0
itemlisting  0       1            0     0
item         0       0            1     0.3
others       0       0            0.3   1

2.2 Lookup table of the instruction similarity function of WikiHow

Table 3: Patterns of WikiHow instructions

Pattern Name  Pattern Template
search        Search an article to learn . . .

Owing to the limited budget, only a subset of 20 tasks is sampled from the full test set. The visualization is available in Figure 2. It can be seen that the performance of R However, the performance seems to saturate, which may be attributed to the limited number of active exemplars and training tasks. The saturation of the average reward comes later than that of the success rate. Double Q-Learning [van Hasselt, 2010] is usually leveraged to ameliorate over-estimation in lookup-based Q-Learning.
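As a small illustration, the page-similarity lookup of Table 2 could be encoded as below. The values are copied from Table 2; the names `PAGE_PATTERNS`, `PAGE_SIMILARITY`, and `page_similarity` are hypothetical and not taken from the paper's code.

```python
# Page patterns of WebShop, in the row/column order of Table 2.
PAGE_PATTERNS = ("search", "itemlisting", "item", "others")

# Similarity values copied from Table 2 (rows and columns follow PAGE_PATTERNS).
PAGE_SIMILARITY = (
    (1.0, 0.0, 0.0, 0.0),  # search
    (0.0, 1.0, 0.0, 0.0),  # itemlisting
    (0.0, 0.0, 1.0, 0.3),  # item
    (0.0, 0.0, 0.3, 1.0),  # others
)


def page_similarity(pattern_a: str, pattern_b: str) -> float:
    """Look up the similarity of two WebShop page patterns.

    Raises ValueError if either pattern is not one of PAGE_PATTERNS.
    """
    row = PAGE_PATTERNS.index(pattern_a)
    col = PAGE_PATTERNS.index(pattern_b)
    return PAGE_SIMILARITY[row][col]


if __name__ == "__main__":
    print(page_similarity("item", "others"))
    print(page_similarity("search", "item"))
```

Only the `item` and `others` patterns are partially similar to each other; every other pair of distinct patterns has similarity 0.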